Exploring Performance Limits to Future Instruction-Level-Parallel Processors
Authors
Abstract
In this paper, we examine the relative importance of memory latency, memory bandwidth, and branch predictability on the performance of future processors. We develop and validate a sampling-based simulation methodology that allows us to simulate a large number of design points. Our methodology ensures that the entire execution profile of the application is captured while limiting the errors induced by sampling to less than 2%. We extend our simulation results by fitting the data to analytic expressions of filters. Using the insight gained from these expressions, our simulation data, and known technological trends, we develop an understanding of the factors that will limit the performance of future-generation processors. From our examination, we conclude the following. The amount of instruction-level parallelism exploited by an application changes the relative importance of performance bottlenecks. In systems with less capacity to exploit instruction-level parallelism, memory latency and memory bandwidth limit the performance. However, for systems with much greater capacity than today’s processors and with more aggressive implementations, these two memory-induced bottlenecks can be eliminated. In contrast, even for such future systems, the inability to perfectly predict the direction taken by conditional branches remains a fundamental limit for both integer and floating-point applications. Interestingly, in contrast to the commonly held belief that floating-point applications are not affected by branch prediction accuracy, we find that branch prediction remains a fundamental bottleneck for all applications on aggressive implementations of future systems.
Similar resources
Architecture and Compiler Design Issues in Programmable Media Processors
The processing demands for multimedia applications are rapidly escalating. Many current applications are pushing the limits of existing microprocessors, and the next generation of multimedia promises considerably greater demands. Adequate support for future multimedia requires the flexibility and computing power of high-level language (HLL) programmable media processors. This thesis examines th...
Exploring Hypermedia Processor Design Space
Distributed hypermedia systems that support collaboration are important emerging tools for creation, discovery, management and delivery of information. These systems are becoming increasingly desired and practical as other areas of information technologies advance. A framework is developed for efficiently exploring the hypermedia design space while intelligently capitalizing on tradeoffs betwee...
Comparison of features for current commercial multicore
Published by the IEEE Computer Society. 0018-9162/10/$26.00 © 2010 IEEE. In the past, developers used additional capacity to develop superscalar CPUs with replicated execution units and deep pipelines to exploit instruction-level parallelism. However, they only harvested about 25 percent of the additional chip space that became available per year by adding new architectural features. Moreover, t...
Limits of Instruction Level Parallelism with Data Speculation
Increasing the instruction level parallelism (ILP) exploited by the processor is one of the key issues in boosting the performance of future generation processors. Current processor organizations include different mechanisms to overcome the limitations imposed by name and control dependences but no mechanism targeting data dependences. Thus, these dependences will become one of the main bottlen...
Publication date: 2005